# modified scripts to allow different inputs:
# source(here::here("networks/Manhattan_plot.R"))
system(paste("Rscript",
here::here("networks/Manhattan_plot.R"),
"-p networks/MS.pvals.out",
"-o networks/ms_manhattan_plot.pdf"
)
)
system(paste("Rscript",
here::here("networks/Manhattan_plot.R"),
"-p networks/HT.pvals.out",
"-o networks/ht_manhattan_plot.pdf"
)
)
# source(here::here("networks/qqplot.R"))
system(paste("Rscript",
here::here("networks/qqplot.R"),
"-p networks/MS.pvals.out",
"-o networks/ms_qqplot.pdf"
)
)
system(paste("Rscript",
here::here("networks/qqplot.R"),
"-p networks/HT.pvals.out",
"-o networks/ht_qqplot.pdf"
)
)
The MS GWAS is more powerful than the HT GWAS. We can infer this difference by noting that more variant associations lie above the y = x line in the MS QQ plot, and that more variants exceed the horizontal significance threshold in the MS Manhattan plot.
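The two quantities compared above can be sketched in a few lines of R. This is a minimal illustration on simulated p-values, not the MS or HT data, and it is not taken from `Manhattan_plot.R` or `qqplot.R`; the genome-wide threshold of 5e-8 is an assumption, since the threshold used by the plotting script is not stated here.

```r
# Hypothetical p-values standing in for a GWAS result (NOT the MS/HT data):
# mostly null (uniform) with a handful of very strong associations.
set.seed(1)
pvals <- c(runif(9990), runif(10, min = 0, max = 1e-9))

# QQ plot coordinates: expected vs observed -log10(p).
# ppoints() gives the expected uniform quantiles in ascending order,
# so both vectors run from the most to the least significant variant.
n_variants <- length(pvals)
observed <- -log10(sort(pvals))
expected <- -log10(ppoints(n_variants))

# Number of variants lying above the y = x line of the QQ plot
n_above_diagonal <- sum(observed > expected)

# Number of variants above the horizontal line of a Manhattan plot,
# using an assumed genome-wide significance threshold
threshold <- 5e-8
n_significant <- sum(pvals < threshold)
```

Calling `plot(expected, observed)` followed by `abline(0, 1)` would draw the QQ plot with its y = x reference line. Running the same counts on two GWAS result files gives a rough numerical counterpart to the visual comparison.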
The above plot shows the full parent network. Analysing this network in Cytoscape produced the following summary statistics and plots:
Number of nodes: 8960
Number of edges: 27724
Avg. number of neighbors: 6.363
Network diameter: 13
Network radius: 7
Characteristic path length: 4.382
Clustering coefficient: 0.088
Network density: 0.001
Network heterogeneity: 2.063
Network centralization: 0.033
Connected components: 164
Analysis time (sec): 34.270
Most nodes in the main parent PPI network have very few connections (low degree), as can be seen in the network degree plot. The ‘power-law’ shape of this plot also indicates that the network does not follow a random degree distribution, but it could follow a scale-free or hierarchical one.
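One way to examine the degree distribution outside Cytoscape is sketched below in base R. The edge list is a toy example with hypothetical node names, not the parent PPI network, and the log-log regression is only a rough diagnostic: an approximately straight line is suggestive of a power law but does not prove one.

```r
# Toy undirected edge list (hypothetical). For a real network, the source
# and target columns of the SIF file would take the place of from/to.
edges <- data.frame(
  from = c("A", "A", "A", "A", "B", "C", "E"),
  to   = c("B", "C", "D", "E", "C", "D", "F"),
  stringsAsFactors = FALSE
)

# Degree of each node: the number of edge endpoints it appears in
degree <- table(c(edges$from, edges$to))

# Degree distribution: how many nodes have each degree value
degree_dist <- table(as.vector(degree))
k <- as.numeric(names(degree_dist))   # degree values
n_k <- as.numeric(degree_dist)        # number of nodes with that degree

# Slope of log10(node count) against log10(degree); for a power law
# P(k) ~ k^(-gamma) the slope would be roughly -gamma
fit <- lm(log10(n_k) ~ log10(k))
slope <- unname(coef(fit)[2])
```

On a real network, `plot(log10(k), log10(n_k))` would give the same view as the degree plot discussed above.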
The general upward trend of the betweenness-by-degree distribution indicates that a small number of genes in this network are hubs, that is, common bridges lying on the shortest paths between other genes. This could perhaps indicate that the network is hierarchical, rather than scale-free.
#source(here::here("networks/Pathway_permutations.R"))
system(paste("Rscript",
here::here("networks/Pathway_permutation.R"),
"-p networks/parent_PPI.sif",
"-o networks/q4_pathway_permutation.pdf"
)
)
The above plots indicate that:
Overall, these observations suggest that networks from GWAS are more connected than would be expected (although this is much more true for MS than for HT).
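The idea of "more connected than expected" can be illustrated with a small permutation test. The sketch below is not the method implemented in `Pathway_permutation.R`, whose contents are not shown here. It assumes a simple statistic (the number of edges within a gene set) and a toy network with hypothetical gene names.

```r
set.seed(42)

# Toy network (hypothetical): 30 genes, where g1..g6 form a fully
# connected module and the remaining genes form a simple chain.
genes <- paste0("g", 1:30)
module <- genes[1:6]
module_edges <- t(combn(module, 2))           # all 15 pairs within the module
chain_edges <- cbind(genes[6:29], genes[7:30])
edges <- rbind(module_edges, chain_edges)

# Statistic: number of edges with both endpoints inside the gene set
count_internal_edges <- function(gene_set, edges) {
  sum(edges[, 1] %in% gene_set & edges[, 2] %in% gene_set)
}

observed <- count_internal_edges(module, edges)

# Null distribution: the same statistic for random gene sets of equal size
n_perm <- 1000
null_counts <- replicate(
  n_perm,
  count_internal_edges(sample(genes, length(module)), edges)
)

# Empirical p-value, with +1 in numerator and denominator so it is never 0
p_empirical <- (sum(null_counts >= observed) + 1) / (n_perm + 1)
```

A small `p_empirical` means that random gene sets of the same size rarely have as many internal edges as the set of interest.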
Looking at the top GO term enrichments from the BiNGO analysis (above), immune-system-related processes clearly emerge as important.
The above plots suggest that the MS GWAS identified many more genes already known to be involved in its pathogenesis than the HT GWAS did. In other words, the HT GWAS identified a greater proportion of genes that had yet to be characterised as contributing to this phenotype, and therefore this GWAS contributed a greater proportion of new knowledge.
#source(here::here("networks/Pathway_permutations.R"))
system(paste("Rscript",
here::here("networks/Pathway_permutation.R"),
"-p networks/Directed_PPI.sif",
"-o networks/q7_pathway_permutation.pdf"
)
)
*Note: there is evidently some problem with this plot …
The above plots indicate that:
The null hypothesis to test is: The number of controllable genes in the MS-associated first order network is consistent with random sampling of controllable genes from the full directed network.
# k: total number of nodes in the MS-associated first-order network
# (the number of draws in the hypergeometric parameters)
k = 546
# m: total number of controllable nodes in the directed network,
# i.e. dispensable + indispensable (3677 minus the 8 unlabelled nodes)
m = 3677 - 8
# n: number of remaining (non-controllable) nodes, i.e. the total
# number of nodes in the directed network (6338) minus m
n = 6338 - m
# q: the value to test, the number of controllable MS-associated genes
q = 317
# lower.tail = FALSE returns the upper tail P(X > q)
phyper(q = q,
m = m,
n = n,
k = k,
lower.tail = FALSE)
## [1] 0.4493194
This p-value suggests that MS-associated genes are not enriched for controllable genes. This is unsurprising given that the proportion of controllable genes among the MS-associated genes (317/546 \(\approx\) 58%) is very similar to the proportion of controllable genes in the entire directed network (3669/6338 \(\approx\) 58%).
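One detail of the test above is worth making explicit. With `lower.tail = FALSE`, `phyper` returns the strict upper tail \(P(X > q)\), whereas a one-sided enrichment p-value is often defined as \(P(X \ge q)\), which includes the observed count. The sketch below reuses the values from the test above to compute both tails and cross-checks the inclusive one against a direct sum of the probability mass function. The variable names are illustrative.

```r
# Values reused from the hypergeometric test above
k <- 546          # nodes in the MS-associated first-order network (draws)
m <- 3677 - 8     # controllable nodes in the directed network
n <- 6338 - m     # non-controllable nodes in the directed network
q <- 317          # controllable MS-associated genes (observed)

# Strict upper tail P(X > q), as returned with lower.tail = FALSE
p_strict <- phyper(q, m, n, k, lower.tail = FALSE)

# Inclusive upper tail P(X >= q), obtained by shifting q down by one
p_inclusive <- phyper(q - 1, m, n, k, lower.tail = FALSE)

# Cross-check: sum the probability mass function from q up to k
p_check <- sum(dhyper(q:k, m, n, k))

# The two proportions compared in the text
prop_ms <- q / k          # controllable fraction among MS-associated genes
prop_all <- m / (m + n)   # controllable fraction in the whole network
```

The two tails differ only by the probability of observing exactly `q` controllable genes, so the choice does not change the conclusion here, but it can matter when the observed count is small.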